Comprehensive Data Cleansing

Authors

  • Heiko Müller
  • Johann-Christoph Freytag
Abstract

Cleansing data from impurities is an integral part of data processing and maintenance. This has led to the development of a broad range of methods intended to enhance the accuracy, and thereby the usability, of existing data. This paper presents a survey of data cleansing problems, approaches, and methods. We classify the various types of anomalies occurring in data that have to be eliminated, and we define a set of quality criteria that comprehensively cleansed data has to satisfy. Based on this classification, we evaluate and compare existing approaches for data cleansing with respect to the types of anomalies they handle and eliminate. We also describe the general steps in data cleansing, specify the methods used within the cleansing process, and give an outlook on research directions that complement the existing systems.

Similar resources

Cleansing and preparation of data for statistical analysis: A step necessary in oral health sciences research

In many published articles, there is still no mention of quality control processes, which might be an indication of the insufficient importance the researchers attach to undertaking or reporting such processes. However, quality control of data is one of the most important steps in research projects. Lack of sufficient attention to quality control of data might have a detrimental effect on the r...

DTNC: A New Server-side Data Cleansing Framework for Cellular Trajectory Services

It is essential for the cellular network operators to provide cellular location services to meet the needs of their users and mobile applications. However, cellular locations, estimated by network-based methods at the server-side, bear with high spatial errors and arbitrary missing locations. Moreover, auxiliary sensor data at the client-side are not available to the operators. In this paper, w...

Reducing the Risk of Insider Misuse by Revising Identity Management and User Account Data

To avoid insider computer misuse, identity and authorization data referring to the legitimate users of the systems must be properly organized and constantly and systematically analyzed and evaluated. In order to support this, a methodology for structured Identity Management has been developed. This methodology includes gathering of identity data spread among different applications, systematic c...

Minimizing insider misuse through secure Identity Management

To avoid insider computer misuse, identity, and authorization data referring to the legitimate users of systems must be properly organized, constantly and systematically analyzed, and evaluated. In order to support this, structured and secure Identity Management is required. A comprehensive methodology supporting Identity Management within organizations has been developed, including gathering o...

Effect of Denture Cleansing Solutions on Retention of Two Types of Overdenture Attachments

Background and Aim: There is a lack of information regarding the effect of different denture cleansing solutions on the retention of attachments. This study aimed to assess the effect of denture cleansing solutions on the retention of Dalbo-Plus and Locator attachment systems. Materials and Methods: This study evaluated 160 attachments including 80 Locator and 80 Dalbo-Plus attachment systems....

Publication date: 2005